Association between Days Open and Parity, Calving Season or Milk Spectral Data

Simple Summary The days open was calculated as the difference between the dates of successful insemination and previous calving, which is a critical trait for the dairy industry. However, factors affecting days open were not well documented, and clarifying the mechanism of days open will contribute to the high-quality development of the dairy industry. Therefore, the objectives of this study are to evaluate the difference in days open between parity or calving season groups and assess associations of days open with wavenumbers ranging within the mid-infrared region. We found that the days open in calving season groups were significantly different, and dairy cows having a calving event in autumn showed the shortest number of days open. Weak associations were detected between days open and wavenumbers spanning the mid-infrared region. Based on the findings of the study, the reproduction management of dairy herds can be dynamically optimized according to the calving seasons, and the relationship between wavenumbers and days open needs to be further confirmed to service the dairy industry better. Abstract Milk spectral data on 2118 cows from nine herds located in northern China were used to access the association of days open (DO). Meanwhile, the parity and calving season of dairy cows were also studied to characterize the difference in DO between groups of these two cow-level factors. The result of the linear mixed-effects model revealed that no significant differences were observed between the parity groups. However, a significant difference in DO exists between calving season groups. The interaction between parity and calving season presented that primiparous cows always exhibit lower DO among all calving season groups, and the variation in DO among parity groups was especially clearer in winter. Survival analysis revealed that the difference in DO between calving season groups might be caused by the different P/AI at the first TAI. In addition, the summer group had a higher chance of conception in the subsequent services than other groups, implying that the micro-environment featured by season played a critical role in P/AI. A weak linkage between DO and wavenumbers ranging in the mid-infrared region was detected. In summary, our study revealed that the calving season of dairy cows can be used to optimize the reproduction management. The potential application of mid-infrared spectroscopy in dairy cows needs to be further developed.


Introduction
The days open (DO) was calculated as the difference between the dates of successful insemination and previous calving, and it was a critical trait with a complex genetic nature that contributes a lot to the total milk yield over lactation and further to the profitability of the farms [1][2][3][4]. The estimated heritability of DO reported in previous studies ranges from A total of 5000 milk samples from 2456 non-pregnant Holstein dairy cows were collected between November 2019 and December 2021 from 9 herds located in Hebei Province, China. The milk samples were analyzed by an infrared spectrometer (Foss Electric, Hillerød, Denmark) to obtain the spectra data. Overall, 1060 transmittance values of the spectra data were transformed into absorbance using the equation: absorbance = log 10 (1/transmittance). Then, the absorbance values were normalized to a null mean and a unit variance.
We calculated the standardized Mahalanobis distances for all spectra and preserved the spectra whose standardized Mahalanobis distances were not greater than three [38,47]. Milk fat and protein contents, which were predicted from the FT-MIR spectra, were also checked to be in the range of 1.5-9% and 1-7%, respectively [30]. Afterwards, the DO, calculated as the number of days between being conceived and previous calving, was required to be in the range of 25 to 400 d. To evaluate the fertility of dairy cows at early stages of lactation, only milk samples collected between 5 and 120 d of each lactation were retained. Finally, the total number of records used in this study was 3771 on 2118 cows from 9 herds. Table 1 shows the descriptive statistics for milk components and DO.

Difference in DO between Parity or Calving Season Groups
We performed a linear mixed-effects model and survival analysis to access the association between DO and parity or calving season. Survival analysis was performed on the records using the Kaplan-Meier estimator provided by the survival package in R [48]. A Mantel-Haenszel test was conducted to evaluate the difference between levels of factors [49]. The linear mixed-effects model was defined as: where DO represents days open; µ is an intercept; Parity i is the fixed effect of the ith parity (i = 1, 2, 3, ≥4); seasonC j is the fixed effect of the jth calving season (spring, March-May; summer, June-August; fall, September-November; winter, December-February); Parity i × seasonC j is the interaction between the ith parity and jth calving season. Herd k is the random effect of the kth herd (k = 1 to 9), and ε is the residual random error. The model was fitted using the lmer function provided by the lme4 package in R [50]. The effect was considered significant whenever the p-value related to the factor of interest was less than 0.05.

Associations between Days Open and Milk Spectral Data
Considering the high correlation between absorption values of adjacent wavenumbers (Figure 1), associations of DO with milk spectral data were assessed jointly by the model: where DO is the days open, and µ is an intercept. X w is the matrix representing all the 1060 wavenumbers, and β w is the corresponding vector of effects, which is assigned to a mixture (known as BayesB) of a point of mass at zero and a scaled-t slab prior [51]. X f is the matrix for the effects of parity, calving season, and DIM, and β f is the corresponding vector of effects, which is assigned to a flat prior. X r is the matrix for the effects of herds, and β r is the corresponding vector of effects, which is assigned to a Gaussian prior. ε is the residual random error item. The treatments for parity and calving season here were the same as the linear mixed-effects model, and four DIM intervals were created in a monthly manner (5-30 d, 31-60 d, 61-90 d, 91-120 d).
The prior density of BayesB is characterized by 3 hyper-parameters: π, which is subjected to beta distribution, represents the probability of non-null effects. df β , which is a user-defined hyper-parameter, represents the degrees of freedom, and S β , which was subjected to Gamma distribution, indicates the scale of the scaled Student's t-distribution. The extent of variable selection and shrinkage is controlled by these three hyper-parameters.
The BGLR function of the BGLR package in R was used to fit the data with the model defined above [52]. A total of 450,000 iterations of MCMC Gibbs sampling were performed with the first 100,000 iterations discarded as burn-in, and the remaining 350,000 circles were resampled with the thinning interval of 10 samples, thereby obtaining 35,000 samples for subsequent inference. Convergence of the models was inspected visually by trace plots. The prior density of BayesB is characterized by 3 hyper-parameters: π, which i jected to beta distribution, represents the probability of non-null effects. dfβ, whic user-defined hyper-parameter, represents the degrees of freedom, and Sβ, which wa jected to Gamma distribution, indicates the scale of the scaled Student's t-distribution extent of variable selection and shrinkage is controlled by these three hyper-parame The BGLR function of the BGLR package in R was used to fit the data with the m defined above [52]. A total of 450,000 iterations of MCMC Gibbs sampling were perfo with the first 100,000 iterations discarded as burn-in, and the remaining 350,000 c were resampled with the thinning interval of 10 samples, thereby obtaining 35,000 ples for subsequent inference. Convergence of the models was inspected visually by plots.

Marginal and Joint Associations between Wavenumbers and DO
Under the variable selection and shrinkage mechanism of BayesB, the associat the wavenumbers of spectra with DO can be evaluated by the posterior probabil association (PPA) as previously reported by [29,40].
All 1060 wavenumbers in milk spectral data were clustered using an adjacency strained hierarchical agglomerative clustering algorithm (HAC). The adjclust packa R was used to implement HAC, and the optimal number of windows was determin the slope heuristic method, as described by [53]. Both the marginal PPA and join were processed by min-max normalization to compare the relative importance a wavenumbers or clusters.

Marginal and Joint Associations between Wavenumbers and DO
Under the variable selection and shrinkage mechanism of BayesB, the association of the wavenumbers of spectra with DO can be evaluated by the posterior probability of association (PPA) as previously reported by [29,40].
All 1060 wavenumbers in milk spectral data were clustered using an adjacencyconstrained hierarchical agglomerative clustering algorithm (HAC). The adjclust package in R was used to implement HAC, and the optimal number of windows was determined by the slope heuristic method, as described by [53]. Both the marginal PPA and joint PPA were processed by min-max normalization to compare the relative importance among wavenumbers or clusters.

Difference in DO between Parity or Calving Season Groups
The least-squares means (LSM) of DO for dairy cows whose parity equal 1, 2, 3, and ≥4 was 101 ± 3.72 d, 107 ± 3.79 d, 107 ± 4.36 d, and 103 ± 4.71 d ( Figure 2C). No significant difference was observed between parity groups (p > 0.05). The DO of dairy cows whose calving seasons were spring, winter, summer and autumn showed a downtrend among parity groups except for primiparous cows ( Figure 2C). Table 2 and Figure 2B

Associations between Wavenumbers and DO
Consider the high correlation nature between absorption values of adjacent wavenumbers as shown in Figure 1, the relative effect of individual wavenumbers on DO was evaluated by an BayesB model ( Figure 3). In general, negative effect values, which imply negative correlations between absorbance of wavenumbers and DO, were considered favorable for farms.
Wavenumbers in the SWIR region from 5010.15 to 3671.80 cm −1 had a relatively stable and near-zero relative effect on DO. This means that molecules in the milk whose chemical bonds interact with wavenumbers in this region were barely related to DO. The SWIR-MWIR region and the MWIR-2 region spanning from wavenumber 3667.94 to 3050.83 cm −1 and 1697.05 to 1585.20 cm −1 , respectively, are characterized by the high variability of absorbances due to the absorption of water molecules. The effects of wavenumbers in the SWIR-MWIR region were evenly distributed around zero, but the effects of wavenumbers in the MWIR-2 region preferred to be slightly negative. A reasonable explanation for this phenomenon is that the high correlations between adjacent wavenumbers was extrapolated to the effects of wavenumbers on DO. Most wavenumbers located in the boundaries of the MWIR-1 region (wavenumbers from 3046.97 to 2641.99 cm −1 and 1978.60 to 1700.90 cm −1 ) had a negative effect on DO. The boundaries also contain absorption peaks for fat-B and fat-A, which range from 2873 to 2777 cm −1 and 1786 to 1725 cm −1 , respectively [54][55][56]. Moreover, the absorbance of wavenumbers ranging from 2306.44 to 2387.44 cm −1 was positively associated with DO. The MWIR-LWIR region, known as the fingerprint region, is an important region for the prediction of milk components [57]. Wavenumbers from 1546 to 1492 cm −1 represent the absorption peaks of proteins in milk [54,56], and these   (Figure 2A). DO of dairy cows was notably different between calving season groups (p < 0.05). The interaction between parity and calving season presented that primiparous cows always exhibit lower DO among all calving season groups, and the variation in DO among parity groups was especially clearer in winter (Figure 2A). Survival analysis ( Figure 2D) revealed that the difference in LSM of DO among calving season groups might be caused by the different P/AI at the first TAI. To our knowledge, the first TAI in herds adopting synchronization strategy mainly occurred at 50~70 DIM, which is overlapped with the time window that showed a drastic drop in the survival curve. Additionally, the summer group had a higher chance of conception in the subsequent services than other groups, implying that the micro-environment molded by season played a critical role in the P/AI. The median survival time was 103 d, 91 d, 67 d, and 68 d for spring, summer, fall, and winter calving groups, respectively (Table 2).

Associations between Wavenumbers and DO
Consider the high correlation nature between absorption values of adjacent wavenumbers as shown in Figure 1, the relative effect of individual wavenumbers on DO was evaluated by an BayesB model (Figure 3). In general, negative effect values, which imply negative correlations between absorbance of wavenumbers and DO, were considered favorable for farms.

Marginal and Joint Associations between Wavenumbers and DO
The posterior probability of association (PPA) was also used to evaluate the asso tions between wavenumbers and DO. The marginal PPA of 1060 wavenumbers, as as the joint PPA of windows clustered with highly correlated wavenumbers, were infe by the BayesB model.
A total of 42 wavenumber clusters were identified by the HAC algorithm, and number of wavenumbers per cluster ranged from 2 to 348 (Table 3). Since the wind were grouped based on the similarity of neighboring wavenumbers, we also visual all the windows on the correlation heatmap using colored solid-line rectangles, as sh in Figure 1. The five spectral regions labeled SWIR, SWIR-MWIR, MWIR-1, MWIR-2, MWIR-LWIR contain 1, 17, 9, 5, and 10 windows, respectively. In my view, the numb windows included in regions other than SWIR-MWIR and MWIR-2 can be considere an indicator of their information volume. It is well documented that these two spe regions are characterized by high variability in the absorbance and are responsible for absorption of water molecules.  Wavenumbers in the SWIR region from 5010.15 to 3671.80 cm −1 had a relatively stable and near-zero relative effect on DO. This means that molecules in the milk whose chemical bonds interact with wavenumbers in this region were barely related to DO. The SWIR-MWIR region and the MWIR-2 region spanning from wavenumber 3667.94 to 3050.83 cm −1 and 1697.05 to 1585.20 cm −1 , respectively, are characterized by the high variability of absorbances due to the absorption of water molecules. The effects of wavenumbers in the SWIR-MWIR region were evenly distributed around zero, but the effects of wavenumbers in the MWIR-2 region preferred to be slightly negative. A reasonable explanation for this phenomenon is that the high correlations between adjacent wavenumbers was extrapolated to the effects of wavenumbers on DO. Most wavenumbers located in the boundaries of the MWIR-1 region (wavenumbers from 3046.97 to 2641.99 cm −1 and 1978.60 to 1700.90 cm −1 ) had a negative effect on DO. The boundaries also contain absorption peaks for fat-B and fat-A, which range from 2873 to 2777 cm −1 and 1786 to 1725 cm −1 , respectively [54][55][56]. Moreover, the absorbance of wavenumbers ranging from 2306.44 to 2387.44 cm −1 was positively associated with DO. The MWIR-LWIR region, known as the fingerprint region, is an important region for the prediction of milk components [57]. Wavenumbers from 1546 to 1492 cm −1 represent the absorption peaks of proteins in milk [54,56], and these wavenumbers had a negative effect on DO. Three wavenumbers at 1461.77 cm −1 , 1238.07 cm −1 and 1110.79 cm −1 in this region exhibited a negative local minimum effect on DO, and these three wavenumbers have been reported to be related to the absorption of urea (1469 cm −1 ), fat (1460 cm −1 ), acetone (1238 cm −1 ), and lactose (1040 cm −1 ), respectively [54,56].

Marginal and Joint Associations between Wavenumbers and DO
The posterior probability of association (PPA) was also used to evaluate the associations between wavenumbers and DO. The marginal PPA of 1060 wavenumbers, as well as the joint PPA of windows clustered with highly correlated wavenumbers, were inferred by the BayesB model.
A total of 42 wavenumber clusters were identified by the HAC algorithm, and the number of wavenumbers per cluster ranged from 2 to 348 (Table 3). Since the windows were grouped based on the similarity of neighboring wavenumbers, we also visualized all the windows on the correlation heatmap using colored solid-line rectangles, as shown in Figure 1. The five spectral regions labeled SWIR, SWIR-MWIR, MWIR-1, MWIR-2, and MWIR-LWIR contain 1, 17, 9, 5, and 10 windows, respectively. In my view, the number of windows included in regions other than SWIR-MWIR and MWIR-2 can be considered as an indicator of their information volume. It is well documented that these two spectral regions are characterized by high variability in the absorbance and are responsible for the absorption of water molecules.  association between DO and wavenumber. Nevertheless, the wavenumbers with a la relative effect on DO also show stronger marginal PPA with DO; examples include wavenumbers located in the boundaries of the SWIR-1 region and the mentioned w numbers in the MWIR-LWIR region. In addition, except for the SWIR-MWIR and MW 2 regions, windows with slightly positive joint PPA are also located within the bounda of the SWIR-1 and MWIR-LWIR regions. In summary, the results of marginal PPA joint PPA were in line with the relative effects of wavenumbers.

The Limitations of FT-MIR When Considering Complex Traits
Since FT-MIR data have been successfully applied to predict detailed milk comp tion [29,[58][59][60][61], a large number of studies have been dedicated to developing the pote capabilities of FT-MIR such as the prediction of dairy cow health status [62][63][64], dry ma intake [65][66][67], weight [68,69], and heat production [70]. FT-MIR data also can be use increase GWAS power to map genes associated with traits studied [71][72][73]. Days ope a critical fertility trait, was also studied to determine its association with signals from MIR spectra. Although some of the wavenumbers exhibit considerable effects on DO

The Limitations of FT-MIR When Considering Complex Traits
Since FT-MIR data have been successfully applied to predict detailed milk composition [29,[58][59][60][61], a large number of studies have been dedicated to developing the potential capabilities of FT-MIR such as the prediction of dairy cow health status [62][63][64], dry matter intake [65][66][67], weight [68,69], and heat production [70]. FT-MIR data also can be used to increase GWAS power to map genes associated with traits studied [71][72][73]. Days open, as a critical fertility trait, was also studied to determine its association with signals from FT-MIR spectra. Although some of the wavenumbers exhibit considerable effects on DO, the relatively low variance explained by the BayesB models reduced the predictive ability of FT-MIR spectroscopy to DO [47]. A similar dilemma arises in our study, where no wavenumbers or clusters of wavenumbers were found to be particularly associated with DO. For one thing, various environmental factors dominated the phenotypic variance of traits of interest, making great challenges to the robustness of the prediction model based on milk FT-MIR spectroscopy, and for another, the complex traits are partly or totally indirect with the molecules in milk whose concentrations are typically low enough and even under the reference threshold of detection of 100 ppm [74]. This is undoubtedly another major obstacle to the generalization and application of milk FT-MIR spectroscopy.
Studies that have made some useful attempts to break these limitations have been reported. Ref. [38] adopted a differences spectra strategy to enhance the signal-to-noise ratio when predicting the pregnancy status of dairy cows. Ref. [75] analyzed the same milk sample twice and removed the wavenumbers whose absorbance changed notably between measures to improve the signal-to-noise ratio. Currently, artificial intelligence technologies, especially deep learning, are making breakthroughs in various fields. Ref. [76] showed that the average Gard-CAM weights based on deep learning models can be used to select signals with important information during the model development procedure. Overall, we should improve the signal-to-noise ratio to highlight the wavenumbers that are truly associated with the traits of interest, such as the days open.

Factors Affect Days Open and the Association of DO with Wavenumbers
We investigated the differences in DO between the parity and calving season groups, and the results suggest that significant differences exist between the calving season groups. Dairy cows that underwent calving events in the fall presented the shortest DO compared to other groups. This is line with previous studies by [77,78]. The former study reported that the DO values of spring, summer, and autumn were 5.93, −2.66, and −3.53, respectively, compared to winter. The latter study reported DO values of 115, 104, 99, and 104 for the spring, summer, autumn, and winter calving seasons, respectively. Our study also found that summer had a higher probability of conception in the subsequent services than other seasons, implying that the micro-environment molded by seasons played an important role in the P/AI of services.
To our knowledge, the study by [47] was the first one evaluating the association of days open with wavenumbers of milk spectral data. Different from the way they deal with DIM (divided into four stages as monthly intervals), we treated DIM as a fixed effect in the BayesB model by assigning a flat prior to it. We also take parity and calving season into consideration when developing the BayesB model. Although the models were defined differently, the wavenumbers identified to be associated with DO were located in the same region. Benefiting from partitioning data into four parts according to DIM, [47] found that the DIM period 31 to 60 d seems to be most informative about the fertility of the dairy cow and is directly related to the milk lactation curve and its energy requirements.

Conclusions
Overall, our results suggest that the calving season of dairy cows can be used as an assistant tool to help farms optimize their breeding manage for better profits. A number of wavenumbers were also identified to be associated with days open of dairy cows, such as the wavenumbers at the boundaries of the MWIR-1 region, which are responsible for the absorption peaks of Fat-A and Fat-B, respectively, and wavenumbers located in the MWIR-LWIR region, which are related to the absorption of urea (1469 cm −1 ), fat (1460 cm −1 ), acetone (1238 cm −1 ), and lactose (1040 cm −1 ). These milk components could be regarded as effective indicators of the effects of animal metabolic and physiological status on fertility. However, the relatively low impact of these wavenumbers on days open suggests that predicting days open using FT-MIR only will be challenging.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.